Haoyu Wu

OmniGF: A Dual-Branch Vision-Language Framework for Unified Gaze Following

May 26, 2026

ForeSplat: Optimization-Aware Foresight for Feed-Forward 3D Gaussian Splatting

May 21, 2026

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

Apr 21, 2026

Learning 3D Reconstruction with Priors in Test Time

Apr 04, 2026

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning

Aug 01, 2025

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling

Jul 10, 2025

DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations

May 26, 2025

MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft

Apr 11, 2025

Fast Autoregressive Video Generation with Diagonal Decoding

Mar 18, 2025

CoheDancers: Enhancing Interactive Group Dance Generation through Music-Driven Coherence Decomposition

Dec 26, 2024